SIRUS: Stable and Interpretable RUle Set for classification

Authors

Abstract

State-of-the-art learning algorithms, such as random forests or neural networks, are often qualified as “black-boxes” because of the high number and complexity of operations involved in their prediction mechanism. This lack of interpretability is a strong limitation for applications involving critical decisions, typically the analysis of production processes in the manufacturing industry. In such critical contexts, models have to be interpretable, i.e., simple, stable, and predictive. To address this issue, we design SIRUS (Stable and Interpretable RUle Set), a new classification algorithm based on random forests, which takes the form of a short list of rules. While simple models are usually unstable with respect to data perturbation, SIRUS achieves a remarkable stability improvement over cutting-edge methods. Furthermore, SIRUS inherits a predictive accuracy close to random forests, combined with the simplicity of decision trees. These properties are assessed both from a theoretical and empirical point of view, through extensive numerical experiments based on our R/C++ software implementation sirus, available from CRAN.

Similar resources

A Bayesian Framework for Learning Rule Sets for Interpretable Classification

We present a machine learning algorithm for building classifiers that are comprised of a small number of short rules. These are restricted disjunctive normal form models. An example of a classifier of this form is as follows: If X satisfies (condition A AND condition B) OR (condition C) OR · · · , then Y = 1. Models of this form have the advantage of being interpretable to human experts since t...

An interpretable fuzzy rule-based classification methodology for medical diagnosis

OBJECTIVE: The aim of this paper is to present a novel fuzzy classification framework for the automatic extraction of fuzzy rules from labeled numerical data, for the development of efficient medical diagnosis systems. METHODS AND MATERIALS: The proposed methodology focuses on the accuracy and interpretability of the generated knowledge that is produced by an iterative, flexible and meaningful ...

Interpretable Two-level Boolean Rule Learning for Classification

This paper proposes algorithms for learning two-level Boolean rules in Conjunctive Normal Form (CNF, i.e. AND-of-ORs) or Disjunctive Normal Form (DNF, i.e. OR-of-ANDs) as a type of human-interpretable classification model, aiming for a favorable trade-off between the classification accuracy and the simplicity of the rule. Two formulations are proposed. The first is an integer program whose obje...

Methods and Models for Interpretable Linear Classification

We present an integer programming framework to build accurate and interpretable discrete linear classification models. Unlike existing approaches, our framework is designed to provide practitioners with the control and flexibility they need to tailor accurate and interpretable models for a domain of choice. To this end, our framework can produce models that are fully optimized for accuracy, by ...

A Margin-based Model with a Fast Local Search for Rule Weighting and Reduction in Fuzzy Rule-based Classification Systems

Fuzzy Rule-Based Classification Systems (FRBCS) are highly investigated by researchers due to their noise-stability and interpretability. Unfortunately, generating a rule-base that is both sufficiently accurate and interpretable is a hard process. Rule weighting is one of the approaches to improve the accuracy of a pre-generated rule-base without modifying the original rules. Most of the pro...

Journal

Journal title: Electronic Journal of Statistics

Year: 2021

ISSN: 1935-7524

DOI: https://doi.org/10.1214/20-ejs1792